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Technical  Accomplishments 

We  developed  an  improved  numerical  method  for  the  simulation  of  re-entry  vehicle  flow 
fields,  including  finite-rate  internal  energy  relaxation  and  chemical  reactions.  This 
method  is  applicable  to  the  hard-body  flow  field,  as  well  as  to  the  vehicle  wake.  The 
method  and  approach  are  documented  in  two  conference  papers  and  a  journal  article 
(attached)  [1,2].  The  computer  code  that  implements  the  method  was  transitioned  to 
AFRL  Space  Vehicles  Directorate  for  the  simulation  of  re-entry  vehicle  flow  fields. 

Prior  to  the  present  work,  existing  numerical  simulation  approaches  used  computational 
fluid  dynamics  (CFD)  methods  that  were  designed  for  chemical  kinetics  models 
involving  a  fairly  small  number  of  chemical  species  -  typically  about  10  or  less. 
However,  realistic  re-entry  vehicle  flows  may  involve  a  much  larger  number  of  species 
due  to  ablation  products,  trace  species  that  may  be  relevant  for  signatures,  and  ionization 
of  these  species.  This  can  result  in  perhaps  50  chemical  species  being  important  for 
certain  types  of  analyses.  The  computational  and  memory  costs  of  present  methods  scale 
with  the  square  of  the  number  of  chemical  species.  This  scaling  makes  the  simulation  of 
large  chemical  kinetics  models  extremely  onerous  and  limits  the  analyst’s  use  of  more 
detailed  kinetics  models. 

We  have  developed  a  new  computational  method  that  reduces  the  simulation  costs  using 
a  novel  approach.  The  idea  is  to  decouple  the  chemical  species  mass  conservation 
equations  from  the  flow  field  total  mass,  momentum,  and  energy  conservation  equations. 
This  significantly  reduces  the  cost  of  simulations  because  the  resulting  linear  system  of 
equations  is  simpler  to  solve.  However,  there  are  several  issues  involving  robustness  and 
solution  accuracy  that  had  to  be  resolved.  In  the  following,  we  give  a  brief  description  of 
the  approach;  full  details  are  available  in  the  attached  manuscript  that  was  published  as  an 
AIAA  Journal  article. 


Decoupled  Implicit  Method 

The  usual  approach  to  the  numerical  solution  of  thermo-chemical  nonequilibrium  flows  is 
to  solve  the  fully-coupled  species  mass,  momentum  and  energy  equations  using  an 
implicit  time  integration  method.  The  cost  of  this  approach  scales  quadratically  with  the 
number  of  equations  being  solved.  Thus,  as  the  number  of  chemical  species  increases,  the 
computational  cost  drastically  increases.  We  have  developed  an  approach  that  decouples 
the  species  mass  and  internal  energy  conservation  equations  from  the  total  mass, 
momentum  and  energy  equations.  Such  an  approach  is  commonly  used  in  the  combustion 
literature;  however,  we  have  found  that  those  methods  may  yield  different  results  from 
the  fully-coupled  approach.  The  following  outlines  the  new  decoupled  method. 

Consider  the  one-dimensional  Euler  equations  of  a  thermo-chemical  nonequilibrium  gas: 


Approved  for  public  release;  distribution  is  unlimited. 

1 


with: 


/Pi  \ 

/  Pm  \ 

(  Wl\ 

Pns 

F  = 

Pns ^ 

W  = 

U’na 

pu 

pu2  +  p 

0 

Ev 

Evu 

wv 

\  E  / 

\  (E  +  p)uj 

\  0  / 

Here,  we  have  neglected  the  viscous  fluxes  (mass  diffusion,  shear  stresses,  and  thermal 
conduction)  for  simplicity  in  the  following  derivation.  The  computational  methods 
include  all  of  these  terms  in  the  solution  of  the  governing  equations.  Conventional  fully- 
coupled  numerical  methods  solve  the  above  set  of  equations.  However,  consider  splitting 
this  equation  into  two  components: 
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Note  that: 


F  =  uU 

The  two  equations  are  mathematically  equivalent  to  the  original  conservation  law. 
In  discrete  finite-volume  form,  these  equations  may  be  written  as: 


dU  1  -  ~ 

+  jr-FS 
s 


with  the  sum  over  the  faces  of  the  finite -volume  element.  There  are  many  possible  ways 
to  solve  these  two  equations.  Consider  a  first-order  in  time  implicit  method  in  which  we 
linearize  the  flux  vectors  using: 
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Consider  solving  these  equations  in  two  steps.  One  such  approach  would  be  to  first  solve 
for  the  total  mass,  momentum  and  energy,  neglecting  the  term  due  to  the  linearization  of 
T  with  respect  to  U .  Then  we  have  the  implicit  problem: 


V_ 
A t 


Et)J~  -  \ — *  ~ 

— —su  =  ->r 
du  r 


The  solution  of  this  linear  system  provides  the  velocity  field  at  the  future  time  level,  n+ 1. 
Now,  using  the  fact  that  F  =  uU  we  can  write 

pi+i  =  u"+i  u”  .  §  +  un+l6Un  ■  S 

Then  the  second  part  of  the  flux  may  be  obtained  by  solving  the  system  of  equations: 

—SUn  +  V  un+16Un  .S-\'—5Un  =  -  V  un+lUn  ■  §+  VW1  +  V—L&Un 
At  "  dU  j  dU 

A  first  glance,  this  does  not  seem  to  be  a  substantial  improvement  over  the  fully-coupled 
solution.  However,  there  are  some  features  of  this  approach  that  are  quite  interesting. 

First,  note  that  the  cost  of  solving  the  fully-coupled  set  of  equations  varies  with  the 
number  of  equations  squared.  In  this  case,  that  would  be  (ns  +  5)2  for  a  three-dimensional 
problem  with  ns  chemical  species  and  one  internal  energy  equation.  The  solution  is 
typically  done  with  the  Data-Parallel  Line-Relaxation  (DPLR)  method,  involving  the 
iterative  solution  of  a  block  tridiagonal  system  of  equations  [3], 

For  the  decoupled  method,  the  same  DPLR  method  must  be  applied  to  the  total  mass 
momentum  and  energy  conservation  equations.  This  also  has  a  cost  that  scales 
quadratic  ally;  in  this  case  a  three-dimensional  problem  would  scale  as  52.  The  second  part 
of  the  problem  is  considerably  easier  to  solve.  The  flux  linearization  is  simple  and  as  a 
result,  the  block  tridiagonal  problem  is  reduced  in  complexity  and  cost.  Since  the 
Jacobian  matrix  of  the  source  vector  is  full,  the  cost  of  this  problem  will  scale  like  (ns  + 

1  )2,  but  with  a  substantially  smaller  constant  of  proportionality  than  the  full  DPLR 
solution.  Furthermore,  the  simplicity  of  the  implicit  problem  requires  less  memory  than 
the  fully-coupled  method.  Each  neighboring  element  requires  the  storage  of  a  Jacobian 
matrix  of  size  (ns  +  5)  x  (ns  +  5)  for  the  DPLR  method;  for  hexahedral  elements,  this  is 
six  Jacobian  matrices  as  well  as  the  diagonal.  Here,  we  require  the  storage  of  the 
corresponding  5x5  matrices  plus  a  single  (ns  +  1)  x  (ns  +1)  matrix.  The  extension  to 
viscous  flows  is  straight-forward  and  does  not  increase  the  cost  of  the  approach. 

There  are  several  technical  details  that  are  beyond  the  scope  of  the  present  report  (and 
involve  cumbersome  nomenclature).  We  implemented  the  decoupled  method  within 
US3D  [4,5],  and  have  found  that  it  converges  to  a  steady-state  at  approximately 
the  same  rate  as  the  fully-coupled  method  run  with  the  DPLR  method  in  the  US3D 
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computational  fluid  dynamics  code.  For  a  large  number  of  chemical  species,  the  cost  per 
time  step  is  reduced  by  a  factor  of  about  4.8.  The  memory  savings  are  also  large  (about 
a  factor  of  5)  because  only  a  single  large  Jacobian  matrix  must  be  stored,  instead  of  the 
usual  seven  matrices  for  the  fully  coupled  approach.  Details  of  the  cost  reduction  are 
presented  in  the  attached  paper. 

The  decoupled  approach  significantly  reduces  the  computational  and  memory  costs  of 
simulations  with  large  numbers  of  chemical  species.  For  example,  it  drastically  reduces 
the  cost  of  computing  flows  with  large  numbers  of  ablation  products,  or  high-enthalpy 
flows  with  many  ionized  species.  In  the  future,  there  are  many  possible  variants  to  this 
approach  that  could  be  explored  that  could  further  reduce  computational  costs  or  allow 
the  approach  to  be  extended  to  time-accurate  implicit  simulations. 

Overlay  Method  for  Trace  Chemical  Species 

In  the  original  proposal,  we  planned  to  implement  an  overlay  approach  for  flows  with 
trace  chemical  species.  The  overlay  approach  is  not  general,  and  only  applies  to  trace 
species  that  do  not  affect  the  bulk  properties  of  the  flow  (overall  density,  vibrational 
temperature,  and  temperature,  in  particular).  Because  of  the  success  of  the  new  decoupled 
method,  this  work  was  not  necessary;  it  is  now  far  better  to  simply  add  additional  species 
to  the  full  flow  field  calculation,  rather  than  running  separate  approximate  simulations  to 
account  for  the  trace  species  flow. 
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APPENDIX 


A  Decoupled  Implicit  Method  for  Aerothermodynamics  and  Reacting  Flows 

Graham  V.  Candler* 

Prarnod  K.  Subbareddy** 

Ioannis  Nompelisj 

Department  of  Aerospace  Engineering  and  Mechanics 
University  of  Minnesota 
110  Union  Street  SE 
Minneapolis,  MN  55455 

Abstract 

We  propose  a  new  implicit  computational  fluid  dynamics  method  for  steady-state  com¬ 
pressible  reacting  flows.  The  concept  is  to  decouple  the  total  mass,  momentum,  and  energy 
conservation  equations  from  the  species  mass  and  internal  energy  equations,  and  to  solve 
the  two  equation  sets  sequentially.  With  certain  approximations  to  the  implicit  system, 
it  is  possible  to  dramatically  reduce  the  cost  of  the  solution  with  little  to  no  penalty  on 
convergence  properties.  Importantly,  the  cost  of  the  decoupled  implicit  problem  scales  lin¬ 
early  with  the  number  of  species,  as  opposed  to  the  quadratic  scaling  for  the  conventional 
fully-coupled  method.  Furthermore,  the  new  approach  reduces  the  memory  requirements 
by  a  significant  factor.  The  decoupled  implicit  method  shows  promise  for  application  to 
aerothermodynamics  problems  and  reacting  flows. 

N  omenclature 

A  convective  flux  vector  Jacobian  matrix 

a  speed  of  sound  (m/s) 

C  source  vector  Jacobian  matrix 

cs  species  s  mass  fraction 

cvs  translational-rotational  specific  heat  of  species  s  ( J /kg  K) 

E  total  energy  per  volume  (J/m3) 

Ev  vibrational  energy  per  volume  (J/m3) 

ev  vibrational  energy  per  mass  (J/kg) 

F  convective  flux  vector 

F'  convective  flux  vector  normal  to  element  face 

Fv  diffusive  flux  vector 

/  value  at  element  face 

h°  heat  of  formation  per  unit  mass  ( J /kg) 

i,j  grid  index 

k  relaxation  step  index 

L,R  values  obtained  from  left  and  right  data 

*  Russell  J.  Penrose  and  McKnight  Presidential  Professor,  Fellow  AIAA 

**  Post-Doctoral  Researcher,  Member  AIAA 
f  Research  Associate,  Member  AIAA 
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element  on  solution  line 

Jacobian  matrices  for  linearization  of  viscous  flux 
molar  mass  of  species  s  (kg/kmol) 
time  step 

number  of  chemical  species 
pressure  (Pa) 

universal  gas  constant  (J/kmolK) 
element  face  area  vector,  magnitude  (nr2) 
components  of  face  normal  vector 
temperature  (K) 
time  (sec) 

vectors  of  conserved  variables 
components  of  velocity  (m/s) 
face-normal  velocity  (m/s) 
element  volume  (nr3) 

vector  of  species  mass  fractions  and  vibrational  energy 

vector  of  source  terms 

eigenvector  matrix 

diagonal  matrix  of  flux  correction,  e 

density  flux  correction 

diagonal  matrix  of  eigenvalues 

convective  eigenvalue 

density,  species  s  density  (kg/m3) 

characteristic  time  (sec) 

diagonal  elements  of  C 

flux  directions 


I.  Introduction 

The  solution  of  complex  geometry  nonequilibrium  flows  has  become  commonplace  over  the  past  decade. 
The  most  efficient  methods  use  implicit  time  integration  to  solve  the  governing  equations  because  of  the 
stiffness  associated  with  resolving  cold-wall  boundary  layers  and  finite-rate  chemical  kinetics.  Larger,  more 
complex  chemical  kinetics  models  are  beginning  to  be  used,  partly  driven  by  the  need  for  more  accurate 
gas-surface  interaction  models.  For  example,  recent  work  of  Martin  and  Boyd1  proposes  a  38  species, 
158  reaction  chemical  kinetics  model  for  a  pyrolyzing  ablator.  For  some  problems  it  may  be  necessary 
to  model  excited  electronic  states  and  trace  species,  which  would  further  increase  the  computational  costs 
beyond  standard  chemical  kinetics  models  for  air  or  other  planetary  atmospheres.  Furthermore,  recent  work 
on  vibrational  state-specific  models  require  very  large  numbers  (potentially  hundreds)  of  chemical  species, 
making  it  impossible  to  apply  these  approaches  to  realistic  geometries. 

The  computational  cost  of  current  implicit  methods  for  aerothermodynamics  problems  scales  quadrati- 
cally  with  the  number  of  equations  being  solved.  Thus,  increasing  the  number  of  chemical  species  drastically 
increases  the  cost  of  a  simulation.  Furthermore,  these  methods  typically  require  the  storage  of  up  to  seven 
large  Jacobian  matrices,  which  incurs  large  memory  costs. 
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Popular  implicit  methods  for  aerothermodynamics  solve  the  governing  equations  in  a  fully-coupled  fash¬ 
ion.  That  is,  the  species  mass  conservation  equations,  momentum  equations,  internal  energy  equation(s), 
and  total  energy  equation  are  all  solved  simultaneously  as  one  large  linear  system  of  equations.  For  example, 
the  clata-parallel  line-relaxation  (DPLR)  method2  requires  the  solution  of  block  tridiagonal  systems  of  equa¬ 
tions,  where  the  blocks  have  dimension  of  (ns  +  5)  x  (ns  +  5),  assuming  a  three-dimensional  problem  with  a 
single  internal  energy  equation,  and  ns  is  the  number  of  chemical  species.  This  line-relaxation  approach  gives 
excellent  convergence  to  steady-state,  but  the  cost  of  the  block  tridiagonal  solution  becomes  more  and  more 
onerous  as  the  number  of  species  increases.  The  popular  LAURA  implicit  method  of  Gnoffo3  has  similar 
quadratic  cost  scaling  with  the  number  of  equations  being  solved. 

In  this  paper,  we  propose  a  new  decoupled  implicit  method  that  significantly  reduces  the  cost  of 
aerothermodynamics  problems,  while  retaining  the  excellent  convergence  properties  of  the  fully  coupled 
line-relaxation  approach.  We  show  that  the  cost  of  this  approach  has  nearly  linear  scaling  with  the  number 
of  species  in  the  kinetics  model.  Furthermore,  it  requires  the  storage  of  just  one  large  Jacobian  matrix,  which 
significantly  reduces  the  memory  requirements.  With  this  method,  the  computational  costs  are  dominated 
by  the  evaluation  of  the  chemical  source  term  and  its  Jacobian. 

The  concept  of  the  proposed  method  is  to  separate  the  mass,  momentum,  and  energy  equations  from 
the  species  mass  and  internal  energy  equations.  The  first  set  of  equations  is  solved  with  the  DPLR  method, 
involving  the  solution  of  5  x  5  (for  three  dimensions)  block  tridiagonal  systems.  Then,  the  implicit  problem 
for  the  species  mass  and  internal  energy  equations  is  approximated,  resulting  in  a  scalar  tridiagonal  system 
of  equations  to  be  solved.  This  drastically  reduces  the  cost  of  the  implicit  system  solution. 

The  proposed  method  is  related  to  several  previous  approaches  aimed  at  reducing  the  cost  of  solving 
problems  with  stiff  finite-rate  chemical  kinetics.  Bussing  and  Murnran4  developed  an  approach  in  which 
the  convective  fluxes  were  computed  explicitly  and  the  chemical  source  term  was  solved  implicitly,  reducing 
issues  with  stiffness.  Shuen  and  Yoon5  made  some  modifications  to  the  Yoon  and  Jameson6  lower-upper 
successive  overrelaxation  (LU-SSOR)  implicit  method  to  extend  it  to  the  simulation  of  chemically  reacting 
flows.  This  approach  results  in  a  mostly  scalar-tridiagonal  system  of  equations  to  be  solved,  and  requires 
the  inversion  of  a  ns  x  ns  block  matrix  due  to  the  source  term  Jacobian.  Park  and  Yoon7  developed  an 
approach  that  takes  advantage  of  elemental  conservation  to  reduce  the  cost  of  the  matrix  inversion,  though 
their  approach  is  difficult  to  extend  to  arbitrary  chemical  kinetics  models.  Eberhardt  and  Imlay8  proposed 
a  modification  to  these  methods  to  eliminate  the  need  for  the  matrix  inversion  by  replacing  the  full  source 
term  Jacobian  with  a  diagonal  matrix.  This  approach  is  appealing  because  it  has  a  very  low  operation  count, 
but  it  is  not  robust  and  may  result  in  lack  of  elemental  conservation  due  to  large  disparities  in  chemical  time 
scales.  Candler  and  Olynick9  proposed  a  modification  to  the  Eberhardt  and  Imlay  approach  using  elemental 
conservation  that  improves  the  robustness  of  the  method,  but  these  approaches  suffer  from  poor  convergence 
on  highly  stretched  grids  due  to  the  approximations  made  in  the  development  of  the  LU-SSOR  and  later 
LU-SGS10  implicit  methods.  Other,  more  recent  approaches  for  aerothermal  and  reacting  flow  simulations 
use  multigrid  approaches  and  approximations  to  the  source  term  Jacobian  to  reduce  simulation  costs.11,12 
In  work  related  to  the  approach  proposed  here,  Schwer  et  al.13  developed  a  operator  splitting  approach 
that  involves  a  chemical  state  time  integration  followed  by  a  fluid  dynamic  time  integration.  An  ordinary 
differential  equation  (ODE)  solver  is  used  for  the  first  step,  and  a  diagonalized  alternating-direction  implicit 
(ADI)  method  is  used  for  the  second  step.  Recently,  Katta  and  Roquemore14  developed  a  senri-implicit 
method  for  very  large  chemical  kinetics  models,  in  which  a  portion  of  the  source  term  Jacobian  is  included  in 
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the  time  integration.  Lian,  Xia,  and  Merkle15  studied  the  integration  of  problems  with  strong  source  terms, 
and  obtained  improved  convergence  by  modifying  the  source  term  evaluation  and  limiting  the  effective  time 
step. 

The  present  approach  builds  on  the  ideas  and  approaches  presented  in  this  previous  work;  it  will  be 
shown  that  the  proposed  method  maintains  the  excellent  convergence  properties  and  lack  of  sensitivity  to 
grid  stretching  of  the  DPLR  method,  while  dramatically  reducing  the  cost  of  the  implicit  solution  for  large 
chemical  kinetics  models.  In  the  remainder  of  the  paper,  we  provide  a  brief  derivation  of  the  DPLR  method 
for  the  fully-coupled  equations.  We  then  derive  the  decoupled  approach  and  outline  the  proposed  solution 
strategy.  We  then  apply  the  method  to  several  test  problems  to  assess  its  convergence  properties  and  measure 
its  computational  costs  versus  the  fully-coupled  DPLR  method. 


II.  Background:  The  Data-Parallel  Line  Relaxation  Method 


Consider  a  multi-species  mixture  of  reacting  gases.  The  inviscid  conservation  equations  can  be  written 
as: 

dU 

=  W  (1) 

This  can  be  written  in  finite-volume  form  as: 


dU 

dt 


f 


(2) 


where  the  summation  is  over  the  faces  of  the  element;  V  is  the  volume  of  the  element  and  S  is  the  outward¬ 
pointing  surface  normal  vector,  S  =  S(sxi  +  syj  +  szk).  For  convenience,  define 


The  vector  of  conserved  quantities  and  the  surface-normal  flux  vector,  F',  are: 


/  Pi  \ 

/  Piu  \ 

Pns 

PnsU' 

pu 

,  F'  = 

puu'  +  psx 

pv 

pvu'  +  psy 

pw 

pwu'  +  psz 

Ev 

Evv! 

V  E  ) 

V  (E  +p)u'  / 

Where  u'  =  usx  +  vsy  +  wsz,  Ev  is  the  vibrational  energy  per  unit  volume,  and  E  is  the  total  energy  per 
unit  volume.  E  is  defined  as 


where  cvs 
by: 


ns  ^  ns 

E  =  y  '  pscvsT  +  Ev  +  — p(u 2  +  v2  +  w2)  +  'y  '  psh° 


(5) 


is  the  specific  heat  of  species  s  and  h°  is  its  heat  of  formation  per  unit  mass.  The  pressure  is  given 


-i 


M, 


(6) 
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where  R  is  the  universal  gas  constant  and  Ms  is  the  species  s  molar  mass. 


One  approach  to  evaluate  the  numerical  fluxes  is  to  use  a  modified  form  of  Steger-Warming  flux-vector 
splitting.16  It  should  be  noted  that  although  we  focus  on  this  particular  numerical  flux  function,  the  approach 
is  extensible  to  other  upwind-biased  flux  methods.  The  modified  Steger-Warming  flux  is  based  on  the  fact 
that  the  flux  vector  is  homogeneous  in  U,  and  therefore,  we  have: 


dF' 

F'  =  WU  =  AU 


The  Jacobian  matrix,  A,  can  be  diagonalized  and  written  as: 

A  =  X~xAX 


(7) 

(8) 


where  A  is  the  diagonal  eigenvalue  matrix.  The  modified  Steger-Warming  flux- vector  splitting  method  obtains 
the  direction  of  the  fluxes  by  splitting  the  eigenvalue  matrix  into  its  positive  and  negative  components,  A ±. 
The  split  Jacobians,  A  ,  are  then  formed  from  A  .  Then,  we  can  obtain  an  upwind-biased  numerical  flux 
across  a  finite-volume  face  using: 

F,f  =  A+fUL  +A~fUR  (9) 

where  the  L  and  R  superscripts  indicate  left-  and  right-biased  extrapolations  of  the  conserved  variables  to  the 
face.  The  split  Jacobians  are  evaluated  using  left  and  right  data  averaged  to  the  element  face,  /.  Unlike  the 
original  Steger-Warming  approach,17  they  are  evaluated  with  the  same  data,  which  produces  less  dissipation 
and  a  more  accurate  flux.16 


Now  consider  an  implicit  time  integration  method,  in  which  we  evaluate  the  fluxes  and  source  vector  at 
the  future  time  level: 

dUrl 


+  ( F,n+1S)f  =  Wn+1 


dt  V 


f 


and  then  we  linearize  using: 


r)F'n 

pln+l  ^  pin  +  y _  SJJn 

dU 

, ,  dWn 

wn+1  ~Wn  +  -  SUn 

oU 

with  6Un  =  Un+1  —  Un .  Then  the  discrete  form  of  the  conservation  equation  becomes: 


SUn 


At  V 


dW’ 


+  -  J2  ( A+fSUL  +  A~f6UR)nSf  5Un  =  --J2  (F'nS)  +  W' 


V 


(10) 


(11) 


(12) 


This  problem  can  be  solved  with  a  variety  of  methods;  one  approach  is  the  data-parallel  line-relaxation 
(DPLR)  method.2  This  approach  solves  a  line-relaxation  problem  along  lines  of  elements  near  solid  surfaces; 
the  off-line  terms  on  the  left-hand  side  of  the  equation  are  updated  during  the  relaxation  process.  This 
method  involves  a  series  of  relaxation  steps  for  5Un  and  can  be  written  as: 

Jt/(0)  =  0 


for  k  =  1,  fcmax 

+  1  ^  ( A+f6UL  +  A-fSUR)(k)Sf  -  ^  5U{k)  =  -y  ( F'nS)f  +  Wn 

V  f=t  f 

( A+fSUL  +  A~f 5UR)^k~V> 

V 

end 

SUn  =  6U (fcmax) 


(13) 
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where  f  =  t  indicates  the  contribution  to  the  left-hand  side  of  the  implicit  problem  due  to  the  change  in 
the  solution  on  the  line  of  elements.  Note  that  because  the  implicit  problem  solves  along  a  line  of  elements, 
SUL  and  SUR  involve  each  element  and  two  of  its  neighbors.  This  results  in  a  block  tridiagonal  system  of 
equations  to  be  solved  for  each  line  of  SU.  Here,  fcmax  is  the  number  of  relaxation  steps,  which  is  typically 
taken  as  4  for  optimal  convergence.2 

Here  we  have  not  included  the  viscous  fluxes,  however  it  is  straight-forward  to  do  so.  Details  may  be 
found  in  Wright  et  al.2  The  approach  is  to  linearize  the  n  +  1  viscous  flux,  F™+1  as: 

F™+1  =  F™  +  5F™  ~  F ™  +  M-^-(NSU)  (14) 

where  77  is  the  direction  normal  to  the  face,  and  M  and  N  are  Jacobian  matrices.  This  results  in  additional 
matrices  added  to  the  inviscid  flux  vector  Jacobians  in  the  above  expression;  this  does  not  change  the 
structure  of  the  implicit  problem  to  be  solved. 

This  approach  is  particularly  effective  for  high-Reynolds  number  flows  where  the  near-wall  boundary 
layer  must  be  resolved  with  highly-stretched  grids.  It  has  been  shown  that  this  method  can  be  run  at  large 
time  steps  and  converges  to  the  steady-state  solution  in  a  number  of  time  steps  that  is  essentially  independent 
of  the  Reynolds  number  or  the  wall-normal  grid  stretching. 

The  line-relaxation  approach  is  much  less  expensive  than  solving  the  full  linear  system  of  equations. 
Recent  work  has  compared  this  method  to  a  direct  solve  using  the  Generalized  Method  of  Residuals 
(GMRes). 18,19  It  was  found  that  the  DPLR  method  is  less  costly  except  for  certain  problems  where  it 
is  difficult  or  impossible  to  construct  grids  with  lines  of  grid  elements  in  the  wall-normal  direction.  However, 
because  the  main  cost  of  the  DPLR  method  are  the  block  tridiagonal  solves,  the  computational  costs  (both 
compute  time  and  memory)  of  the  DPLR  method  scale  quadratically  with  the  number  of  equations  being 
solved.  Thus,  its  cost  can  become  onerous  when  the  number  of  chemical  species  is  large. 


III.  A  Decoupled  Implicit  Method 


Consider  an  extension  to  the  DPLR  method  that  involves  splitting  the  conservation  equations  into  two 
parts.  Define  U  as  the  vector  of  total  mass,  momentum,  and  energy,  and  U  as  the  species  mass  and  total 
vibrational  energy  per  unit  volume: 


(p) 

(*\ 

pu 

pv 

,  u  = 

pw 

\E  ) 

Pus 

\EVJ 

(15) 


with  corresponding  flux  vectors,  F  and  F.  Now  solve  the  governing  equations  in  two  steps.  First,  solve  for 


the  total  variables: 


dU 

~dt 


±Y,(F'n+1S)f  =  0 
/ 


(16) 


and  then 


+  -  y  (F,n+is)f  =  wn+i 
at  V  “  v  7 


(17) 


The  first  phase  of  the  solution  proceeds  using  the  standard  DPLR  method,  with  a  series  of  block- 
tridiagonal  solves  and  relaxation  steps  for  the  off-line  terms,  as  discussed  above.  The  U  variables  are 
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updated  to  the  n  +  1  time  level,  and  the  non-conserved  (velocity,  pressure,  temperature,  speed  of  sound, 
etc.)  variables  are  updated  assuming  that  the  thermo-chemical  state  (species  mass  fractions  and  vibrational 
energy)  is  fixed  at  the  n  state. 


The  solution  for  the  U  variables  is  somewhat  different,  and  can  be  derived  starting  with  the  discrete 
form  of  the  governing  equation: 


SUn 

~At 


/ 


Wn+1 


(18) 


We  have  already  solved  for  the  total  mass,  momentum,  and  energy  at  time  level  n  +  1,  so  we  can  rewrite 
SUn  as 

5Un  =  pn+1Vn+1  -  pnVn  =  pn+16Vn  +  VnSpn  (19) 

We  have  defined  V  =  (ci, . . . ,  cns,  ev)T ,  with  cs  =  ps/p  the  mass  fraction  of  species  s  and  ev  the  vibrational 
energy  per  unit  mass,  ev  =  Ev/p. 


For  modified  Steger- Wanning  flux  vector  splitting,  the  flux  of  the  V  variables  is  proportional  to  the  flux 
of  total  mass  and  the  total  density  at  the  cell  face.  Specifically,  the  flux  of  species  s  mass  and  vibrational 
energy  are: 


F'/s  =  (csF'p)f  +  (cLs  -  c{)pL\tf  +  (cf  -  c{)pR \-y 
K{  =  ( evFp)f  +  (eR  -  e{)pL xy  +  (eR  -  e{)pR X^f 


(20) 


where  are  the  eigenvalues  of  the  Jacobian  matrices  A±  corresponding  to  convection.  We  choose  to 
approximate  F'n+1  with 


p\/n+l 


F'n  +  (pLX+f)n+1SVLn  +  (pRX[f)n+1dVRn 


(21) 


Now  consider  the  linearization  of  the  source  vector,  W.  which  must  be  consistent  with  the  time  level 
of  the  variables,  U  and  V .  The  U  variables  have  been  updated  to  time  level  n  +  1,  while  the  V  variables 
remain  at  n.  The  source  vector  is  a  function  of  U  and  V  variables,  and  therefore  may  be  written  as 


wn+1  =  W(Un+\  Vn+1)  ~  W(Un+\  Vn)  +  — 


^ svn 

udV 


(22) 


where  the  Jacobian  matrix  is  computed  holding  the  U  variables  fixed  at  n  +  1.  Let 


dW  dU 
dU  0dV 


(23) 


Substituting  the  above  expressions  into  the  discrete  conservation  equation,  we  obtain 

p"+1—  +  A.^2{pL\+f8VL  +  pR\if8VR}  ’  Sf  -  cn’n+18Vn  = 

f  -|  c  n  (24) 

_ _  'y  ^  ^Frn,n+l f  _|_  j^/n,n+ 1  _  y-n 

V  f  At 

The  n  +  1  values  of  the  U  variables  and  the  non-conserved  variables  are  used  wherever  possible  in  this 
expression.  The  solution  of  this  equation  yields  the  mass  fractions  and  vibrational  energy  at  time  n  +  1.  The 
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non-conserved  variables  are  re-evaluated  with  the  updated  thermo-chemical  state  and  the  solution  proceeds 
to  the  next  time  step. 

However,  there  is  an  additional  requirement  on  the  system  of  equations  that  must  be  considered.  We 
know  that  the  mass  fractions  must  sum  to  unity,  and  therefore  the  change  in  the  mass  fractions  must  sum 
to  zero: 

E  c*  =  1  ^  5cs  =  0  (25) 

S  S 

This  is  a  constraint  on  the  solution,  and  is  effectively  a  result  of  solving  one  extra  conservation  equation. 
We  can  sum  the  above  discrete  equation  for  6cs  over  all  species  to  obtain: 


pn+l 

At 


I>:+ 


vU’ 

f 


LKfJ2Scs 


+p‘ 


n,n+ 1 


sf 


i 

V 


EE(^+ls)f 

/  * 


Spn 
A t 


(26) 


Note  that  the  sum  over  the  source  vector  is  zero  because,  by  construction,  it  does  not  change  the  total 
mass.  Since  &cs  =  0,  the  left-hand  side  to  the  above  equation  must  be  zero  (since  it  is  proportional  to 
Scs  =  0).  Thus,  to  satisfy  the  constraint  on  Sc8,  the  right-hand  side  of  this  expression  must  also  be  zero. 
In  the  limit  of  an  infinitely  accurate  discretization,  the  face  values  of  cs  approach  the  cell-centered  values, 
and  the  expression  would  sum  exactly  to  zero.  However  in  discrete  form,  the  face  values  will  be  different  and 
the  term  will  not  sum  precisely  to  zero.  Therefore,  we  can  enforce  the  constraint  by  forming  the  quantity,  e, 
by  summing  over  all  species: 


e 


1 

V 


EEK'”‘S)' 

/  « 


Spn 

At" 


(27) 


To  impose  this  constraint  on  the  system  of  equations,  define  £  as  a  diagonal  matrix  with  e  on  the  diagonal, 
except  for  the  vibrational  energy  equation,  which  has  a  zero  entry  on  the  diagonal.  We  then  subtract  £V 
from  the  right-hand  side  of  the  discrete  equation  to  obtain: 


P 


n+1 


sv 

A t 


y  J2{pLxTf6VL  +  PR XTf8VRY'n+1Sf  -  Cn’n+16Vn  = 
f 

1  f  fin71 

—  y'  / p/n,n+ 1S\  3  ,  Wn,n+ 1  _  yn  - gyn 

Vj  At 


(28) 


This  is  the  final  form  of  the  implicit  decoupled  method  for  an  inviscid  reacting  flow.  Note  that  as  the  solution 
converges  to  steady-state,  e  approaches  zero  because  dp  approaches  zero  and  the  total  mass  flux  balances 
at  steady-state.  Furthermore,  when  8V  and  Sp  approach  zero,  the  original  conservation-law  form  of  the 
species  mass  equations  are  obtained  and  therefore  species  mass  and  elements  are  conserved  by  the  steady 
state  solution.  Thus,  e  is  used  to  maintain  stability  and  physical  consistency  during  convergence. 

One  additional  point  must  be  made  concerning  the  consistency  of  the  solution  between  the  two  sets  of 
variables.  It  is  obvious  that  we  require  that  the  sum  of  the  species  mass  fluxes  be  exactly  equal  to  the  total 
mass  flux.  That  is  when  the  fluxes  are  evaluated  with  the  same  data,  we  obtain: 


E  (PPsS)f  =  (K)f  (29) 

S 

However,  note  that  the  fluxes  depend  on  the  variation  of  the  thermodynamic  properties  between  the  cell 
centers  and  faces.  This  variation  must  be  included  in  the  evaluation  of  F "p  so  that  it  is  consistent  with  the 
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species  mass  fluxes.  In  practice,  this  requires  forming  the  species  mass  fluxes  and  summing  them  to  obtain 
the  total  mass  flux.  If  this  is  not  enforced,  the  decoupled  solution  will  differ  from  the  fully-coupled  result. 

Including  the  viscous  fluxes  in  the  calculation  is  straight-forward.  For  the  total  variables,  we  linearize 
the  viscous  flux  using  a  standard  approach.3  This  results  in  dense  Jacobian  matrices  that  are  added  to 
the  implicit  problem  without  increasing  its  solution  cost.  For  the  V  variables,  the  viscous  flux  Jacobian 
matrix  is  almost  diagonal.  With  Fickian  diffusion  (species  mass  diffusion  proportional  to  the  species  mass 
fraction  gradient),  the  only  non-zero  elements  for  the  species  equations  are  on  the  diagonal.  For  the  diffusive 
vibrational  energy  transport,  there  is  a  flux  due  to  the  diffusion  of  each  species  vibrational  energy.  These 
terms  result  in  non-zero  off-diagonal  entries  in  the  viscous  flux  Jacobian;  however,  through  testing  we  have 
found  that  they  can  be  neglected  without  adversely  affecting  the  convergence  properties  of  the  method. 
Thus,  the  addition  of  the  viscous  fluxes  augments  the  flux  vector  and  modifies  the  diagonal  elements  of  the 
implicit  system  of  equations.  Therefore,  it  does  not  change  the  cost  of  solving  the  decoupled  system. 


IV.  Solution  of  the  Decoupled  Implicit  Problem 

It  is  not  immediately  obvious  that  the  decoupled  approach  will  result  in  a  large  computational  savings. 
We  have  apparently  traded  one  large  implicit  problem  for  two  smaller  problems,  potentially  with  limitations 
on  the  stability  due  to  the  decoupling  of  the  two  equation  sets.  However,  we  have  found  that  it  is  possible  to 
dramatically  reduce  the  difficulty  of  solving  for  the  V  variables  by  making  some  additional  approximations 
to  the  implicit  problem. 

In  the  final  expression  obtained  above,  note  that  the  off-diagonal  terms  in  the  linear  system  of  equations 
for  SV  are  diagonal  matrices  of  the  form  pf  Ai/ ,  plus  the  contribution  due  to  the  linearization  of  the  viscous 
fluxes.  Also,  the  source  vector  Jacobian  matrix  is,  in  general,  a  dense  matrix  because  any  species  can  react 
with  any  other  species.  However,  this  matrix  has  a  particular  structure  due  to  the  form  of  the  law  of  mass 
action.  Note  that  the  destruction  of  a  chemical  species  is  proportional  to  itself,  and  that  its  formation  rate 
primarily  depends  on  different  chemical  species.  Thus,  when  moved  to  the  implicit  side  of  the  equation,  the 
source  term  Jacobian  increases  the  diagonal  elements  and  decreases  the  off-diagonal  elements  of  the  matrix 
operating  on  SV.  This  observation  is  used  in  the  work  of  Katta  and  Roquemore14  in  their  semi-implicit 
method.  The  Landau- Teller  expression  for  the  vibrational  energy  relaxation  shares  this  property.  Therefore, 
we  can  use  this  property  to  split  the  source  vector  Jacobian  into  its  diagonal  elements  and  its  non-diagonal 
elements  to  make  a  much  simpler  implicit  problem.  Let  us  define  x  =  diag(C).  Then,  we  have: 


P 


n+1 


sv 

A t 


V  '52{pLxtf5VL  +  pRKfSVRY'n+1Sf  -  xn'n+1SV{k) 

f  L 

1  f  f)nn 

^2  (F,n’n+1sy  +  wn’n+1  -  Vn-^-jr  -  £Vn  +  (C  -  x)n’n+1SVn 


(30) 


For  simplicity,  we  have  not  included  the  viscous  terms  in  this  expression.  However,  its  form  is  unchanged 
when  these  terms  are  added.  This  system  of  equations  can  be  solved  with  a  modified  version  of  the  DPLR 
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method,  given  by: 
,5R(0)  =  0 


for  k  =  1 ,  fcrnax 

pU+1^K~  +  V  J2{pLxifs^L  +  PR^if6VR}{k)n+1Sf  -  xn'n+15V(k)  =  -VF  (F,n’n+1S)f  +  wn'n+1 
1  V  f 

_yn5_P_  gyn  __  j_  ^  j  pL\+fSVL  +  pR\^f5VR  }  ^  +  {C  -  X)n’n+1 

*  v  m L 

end 

SVn  =  SV{k  max) 

(31) 

Note  that  all  of  the  terms  on  the  left-hand  side  of  this  equation  are  diagonal  matrices,  making  the  line-solve 
a  simple  scalar  tridiagonal  problem. 

V.  Relative  Costs  of  the  Methods 

To  more  clearly  illustrate  the  savings  of  this  approach,  consider  a  schematic  of  the  line-solve  that  must 
be  performed  by  the  fully-coupled  DPLR  method.  For  simplicity,  let  us  assume  that  we  are  computing 
on  a  two-dimensional  i,j  structured  grid.  Then,  for  the  fully-coupled  problem  we  must  solve  the  following 
expression  along  each  constant  j-line  of  data: 


Here,  the  □  represents  a  Jacobian  matrix  of  dimension  ne  x  ne,  where  ne  is  the  number  of  equations  being 
solved.  We  factor  the  block  tridiagonal  system  once,  and  then  perform  a  backwards  substitution  for  each 
relaxation  step.  The  cost  of  solving  this  system  scales  quadratically  with  the  number  of  equations  being 
solved. 

For  the  decoupled  method,  we  solve  one  small  system  as  shown  above  (the  block  matrices  are  4x4  for 
2-D  and  5x5  for  3-D)  and  then  we  solve  a  problem  that  can  be  represented  schematically  as: 


(33) 
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Where  the  \  represents  a  diagonal  matrix,  and  S  a  square  matrix  with  zeros  on  the  diagonal.  These 
matrices  have  dimension  of  (ns  +  1)  x  (ns  +  1).  Clearly,  this  is  a  much  simpler  system  whose  solution  cost 
is  almost  linear  in  the  number  of  equations  (the  matrix-vector  multiply  on  the  right-hand  side  is  the  only 
quadratic  operation) . 

The  number  of  sub- iterations,  fcmax,  used  in  this  update  needs  to  be  chosen  for  optimal  convergence  and 
computational  cost.  Since  the  matrix-vector  multiply  on  the  right-hand-side  of  (33)  is  the  only  term  with 
quadratic  scaling,  limiting  the  number  of  sub-iterations  may  be  cost-effective.  We  find  that  for  cases  studied 
here,  setting  fcmax  to  1  or  2  is  usually  optimal,  and  additional  sub-iterations  do  not  significantly  improve 
the  convergence  rate.  Thus,  in  the  remainder  of  the  paper  we  use  fcm ax  =  2.  For  large  numbers  of  chemical 
species,  the  computational  cost  could  be  reduced  slightly  by  using  a  smaller  value  of  fcmax. 

Figure  1  plots  the  computational  cost  (arbitrary  units  of  time)  of  the  two  implicit  solution  approaches 
for  a  two-dimensional  problem  as  a  function  of  the  number  of  chemical  species  in  the  kinetics  model.  Here, 
we  plot  the  cost  of  the  fully-coupled  implicit  solve  using  the  DPLR  method  versus  the  sum  of  the  small  block 
solve  and  the  decoupled  scalar  tridiagonal  solve.  Note  that  the  fully-coupled  method  scales  quadratically  with 
the  number  of  chemical  species,  while  the  decoupled  approach  shows  essentially  linear  scaling,  as  expected. 
Figure  2  plots  the  speedup,  which  is  the  cost  of  the  fully-coupled  implicit  DPLR  solve  divided  by  the  total 
cost  of  the  decoupled  implicit  solve.  The  speedup  is  essentially  linear  in  the  number  of  equations,  and  reaches 
very  large  values  for  large  numbers  of  chemical  species. 

For  the  DPLR  implicit  problem  shown  above,  each  grid  point  requires  the  storage  of  5  block  matrices 
for  the  2-D  problem  (7  in  3-D).  For  large  chemical  models,  this  represents  the  largest  memory  cost  of  the 
method,  and  this  requirement  can  limit  the  size  of  problem  that  may  be  run  on  a  given  machine.  For  the 
decoupled  approach,  the  small  DPLR  problem  for  the  total  variables  is  relatively  inexpensive,  and  only  a 
single  block  Jacobian  matrix  must  be  stored  for  the  species  conservation  equations.  Thus,  in  the  limit  of 
large  chemical  models,  the  relative  memory  cost  of  the  decoupled  method  will  be  approximately  1/5  for  2-D 
and  1/7  for  3-D. 


VI.  Test  Cases 


A.  Blunt-Body  Flows 

We  have  implemented  the  proposed  decoupled  implicit  method  in  a  parallelized  unstructured  grid  code20 
for  three-dimensional  flows;  this  code  can  also  be  run  using  the  fully-coupled  DPLR  method.  A  series  of  test 
cases  was  run  using  a  simple  128  x  128  grid  on  a  sphere-cone  geometry  (10  cm  nose  radius,  1.1m  long,  8°  cone 
angle).  The  baseline  free-stream  conditions  were  taken  as:  p ^  =  0.001  kg/m3,  Tx  =  226 K,  a  Mach  number 
of  15,  and  an  isothermal  wall  of  500  K.  The  grid  spacing  at  the  surface  was  chosen  so  that  the  cell  Reynolds 
number  based  on  wall  conditions  was  less  than  2,  which  results  in  sufficient  boundary  layer  resolution  for 
accurate  heat  transfer  rate  predictions.  Additional  cases  were  run  to  examine  the  relative  performance  of 
the  proposed  method  as  the  free-stream  conditions  are  varied. 

Several  chemical  kinetics  models  were  used  in  the  simulations.  They  range  from  the  most  basic  single¬ 
species  gas  model  with  N2  and  a  single  vibrational  energy  mode  to  a  38  species  air-carbon-hydrogen  model 
with  49  vibrational  energy  modes  and  158  chemical  reactions.  We  used  a  5-species  air  model  with  the  species 
N2,  02,  NO,  N,  O,  and  5  reactions;  the  11-species  air  model  adds  Nj ,  Oj,  NO+,  N+,  0+,  and  e“  to  those 
species,  and  includes  a  total  of  19  reactions.  We  also  implemented  a  21-species  air-carbon-hydrogen  model 
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that  augments  the  above  model  with  CO2,  CO,  C2,  C3,  CN,  H2,  HCN,  C,  C+,  and  H;  this  involves  a  total 
of  32  reactions.  We  then  added  OH,  CH4,  CH3,  CH2,  CH,  CN+,  CO+,  H+;  this  increases  the  number  of 
reactions  to  76.  Finally,  we  implemented  the  model  proposed  by  Martin  and  Boyd1  that  includes  38  species 
and  158  reactions.  This  involves  adding  a  number  of  polyatomic  hydrocarbons  to  the  species  listed  above. 
The  chemical  rate  data  were  taken  from  Park21  and  Martin  and  Boyd.  A  Landau- Teller  model  for  the 
vibrational  energy  relaxation  was  used  along  with  Millikan- White  relaxation  times.  This  is  the  standard 
vibrational  energy  relaxation  model  used  for  re-entry  flow  simulations.  Most  cases  were  run  with  a  non- 
catalytic  boundary  condition;  it  was  found  that  a  fully-catalytic  boundary  condition  does  not  adversely 
affect  the  convergence  properties  of  the  method. 

The  flow  field  was  initialized  as  free-stream  and  integrated  to  steady-state  conditions.  The  initial  phase 
of  the  simulations  is  violent,  with  a  strong  shock  wave  forming  near  the  surface  and  moving  away  from  the 
surface  until  it  reaches  its  equilibrium  condition.  During  this  process,  both  methods  require  that  the  time 
step  be  started  at  a  small  value  and  then  gradually  increased  as  the  bow  shock  wave  begins  to  approach 
its  equilibrium  position.  It  is  difficult  to  determine  the  optimal  time-step  ramping  schedule;  in  each  case, 
we  experimented  with  increasingly  aggressive  time  steps  until  the  run  would  complete  and  steady-state 
convergence  was  obtained.  In  general,  the  decoupled  method  could  be  run  with  similar  or  the  same  time 
steps  as  the  fully-coupled  method,  though  in  some  cases,  the  maximum  time  step  of  one  or  the  other  method 
was  more  limited.  However,  as  we  will  see,  this  limitation  does  not  significantly  affect  the  convergence 
history. 

Both  methods  have  been  shown  to  converge  to  machine  zero;  now  the  question  is  whether  they  produce 
the  same  result  on  a  given  grid.  For  this  blunt-body  test  case,  we  consider  the  surface  heat  flux  and 
stagnation-line  chemical  state  as  sensitive  measures  for  evaluation  of  the  new  method.  These  quantities  are 
plotted  in  Figs.  3  and  4.  Note  that  as  plotted,  the  results  of  the  two  methods  over-plot  one  another  for 
this  test  case.  More  quantitatively,  the  stagnation  point  heat  flux  predicted  by  the  two  methods  are  within 
0.1%  of  one  another.  The  stagnation  point  heat  flux  is  notoriously  difficult  to  compute  and  small  differences 
in  numerical  dissipation  can  change  the  results.22  The  chemical  mass  fractions  are  identical  to  4  digits  of 
accuracy.  Thus,  for  this  case,  the  decoupled  method  produces  essentially  identical  results  as  the  DPLR 
method.  Below,  we  will  see  that  for  a  more  challenging  problem  the  differences  are  larger,  but  systematically 
converge  as  the  grid  is  refined. 

Convergence  to  steady-state  was  monitored  with  the  L2  norm  of  the  density  residual,  defined  as: 

L’i  =  iJH  " ,?  (3-1) 

Where  A pn  is  computed  in  each  finite  volume  from  the  net  mass  flux  balance  at  each  time  step: 

(35) 

/ 

Thus,  this  is  a  direct  measure  of  the  balance  of  the  mass  fluxes  into  and  out  of  each  grid  element.  When  the 
density  residual  approaches  machine  zero,  the  solution  is  converged.  For  plotting  purposes,  we  normalize  by 
the  initial  L2  norm. 

Figure  5  plots  the  convergence  history  for  the  Mach  15  test  case  computed  using  the  21-species,  32- 
reaction  chemical  kinetics  model.  We  use  a  free-stream  composition  that  is  90%  air  and  10%  CO2  by  mass. 
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Note  that  the  two  methods  converge  to  steady-state  in  about  800  time  steps.  For  this  case,  the  decoupled 
method  is  4.2  times  faster  than  the  fully-coupled  DPLR  method,  and  therefore  it  converges  to  steady  state 
in  about  240  seconds,  compared  to  1000  seconds  for  the  fully-coupled  code.  (These  simulations  were  run  on 
3.2  GHz  Intel  Nehalem  workstation.)  The  maximum  time  step  was  the  same  for  both  methods,  corresponding 
to  a  CFL  number  of  5000.  The  CFL  number  is  defined  as  the  time  step  divided  by  the  characteristic  time 
scale,  r,  which  is  given  by 

r  =  min  — -  (36) 

ij  \u\+a 

where  A  is  the  minimum  dimension  of  the  grid  element,  and  a  is  the  local  speed  of  sound.  In  this  case,  the 
maximum  time  step  corresponds  to  about  10  /zsec. 

Additional  cases  were  run  to  assess  the  performance  of  the  decoupled  method  for  more  challenging 
conditions.  Figure  6  plots  the  convergence  history  for  two  cases:  a  Mach  20  flow  at  the  same  free-stream 
conditions  as  discussed  above,  but  with  an  isothermal  wall  set  to  Tw  =  1000  K;  and  a  Mach  25  flow  with  a 
free-stream  density  of  10-4  kg/m3  and  Tw  =  1500  K.  For  both  cases  the  11-species,  19-reaction  air  model  was 
used.  Note  that  the  Mach  20  case  takes  approximately  twice  as  many  time  steps  to  converge,  and  that  the 
decoupled  method  requires  about  100  additional  time  steps  to  reach  machine  zero.  However,  it  is  3.4  times 
faster  than  the  fully-coupled  DPLR  method,  resulting  in  a  significant  reduction  in  run  time.  Interestingly, 
for  the  Mach  25  case,  the  decoupled  method  converges  slightly  faster  than  the  fully-coupled  method. 

From  these  results,  we  see  that  the  proposed  decoupled  method  converges  in  approximately  the  same 
number  of  time  steps  as  the  fully-coupled  DPLR  method.  Additional  cases  have  been  run  with  catalytic 
walls,  a  range  of  wall  temperatures,  at  larger  densities,  and  to  date  this  observation  holds  for  all  cases 
studied.  As  discussed  above,  the  cost  of  the  decoupled  implicit  solution  is  significantly  reduced,  resulting 
in  substantially  reduced  run  times  with  the  new  method.  Figure  7  quantifies  the  overall  speedup  of  the 
decoupled  method  as  a  function  of  the  number  of  chemical  species  in  the  chemical  kinetics  model.  Note 
that  as  expected  the  decoupled  method  is  substantially  faster  and  the  speedup  increases  with  the  number  of 
equations  being  solved.  However,  the  speedup  is  not  as  large  as  shown  in  Figure  2.  This  is  because  we  are 
now  considering  the  relative  speedup  of  the  entire  code,  rather  than  just  the  implicit  solve. 

Figure  8  plots  the  cost  of  running  100  time  steps  of  each  part  of  the  solution  for  the  two  methods  as  a 
function  of  the  number  of  chemical  species  in  the  model  note  the  different  scales  on  the  time  axis.  We  see 
that  for  the  original  DPLR  method,  the  cost  of  evaluating  the  fluxes  and  the  associated  Jacobian  matrices 
scales  approximately  linearly  with  the  number  of  species,  while  the  other  two  components  scale  quadratically. 
The  cost  of  evaluating  the  chemical  source  term  and  its  Jacobian  remains  a  fairly  small  fraction  of  the  total 
cost,  even  with  a  38-species  model.  Thus,  the  cost  for  the  DPLR  method  is  mostly  driven  by  the  solution  of 
the  implicit  block  tridiagonal  problem. 

With  the  decoupled  method,  the  implicit  solution  scales  nearly  linearly  and  its  cost  becomes  only  a 
small  fraction  of  the  total  cost  as  the  number  of  chemical  species  gets  large.  With  this  method,  the  cost  of 
evaluating  the  chemical  source  term  dominates  the  overall  solution  cost.  This  is  to  be  expected  because  as 
the  number  of  chemical  species  increases,  the  number  of  possible  reactions  increases  quadratically  because 
any  species  can  react  with  any  other  species.  For  example,  the  11-species  model  has  19  reactions,  the  21- 
species  model  has  32  reactions,  the  29-species  model  has  76  reactions,  and  the  38-species  model  has  158 
reactions.  Clearly,  the  cost  of  evaluating  the  chemical  source  term  increases  significantly.  This  is  a  variant 
of  Amdahl’s  law:  in  the  limit,  the  solution  can  only  be  as  fast  as  its  slowest  part.  However,  even  with  this 
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expensive  source  term  evaluation,  the  new  method  is  4.6  times  faster  for  the  38-species  case.  This  speedup 
would  increase  for  larger  chemical  kinetics  models. 


B.  Double-Cone  Flows 

Now  let  us  consider  a  different  test  case  to  explore  how  the  proposed  decoupled  method  performs  on  a 
more  challenging  flow  field.  The  hypersonic  double-cone  flow  has  been  the  subject  of  several  code  validation 
studies.23,24  The  geometry  of  this  problem  is  simple:  a  25°  half-angle  cone  joined  to  a  55°  half-angle  cone; 
both  of  which  have  the  same  surface  length.  The  first  cone  produces  an  attached  shock  wave,  while  the  second 
cone  produces  a  detached  bow  shock.  The  flow  separates  due  to  the  change  in  cone  angle,  which  produces 
a  separation  shock.  These  non-linear  process  are  tightly  coupled  to  one  another,  and  amplify  small  errors 
in  numerical  or  physical  modeling.  This  sensitivity  affects  the  surface  properties  (particularly  heat  flux), 
making  it  possible  to  make  direct  comparisons  to  experimental  data.25,26  The  relative  spatial  accuracy  of  a 
given  numerical  method  can  be  inferred  from  the  variation  of  the  separation  zone  size  with  grid  resolution; 
in  general,  more  accurate  methods  produce  larger  separation  zones.27 

The  conditions  chosen  here  are  particularly  challenging,  corresponding  to  a  high-enthalpy  test  in  the 
CUBRC  Inc.  LENS-I  reflected  shock  tunnel  facility.24  The  flow  conditions  are  air  at:  poo  =  2.0  x  10-3  kg/rn3, 
Too  =  570  K,  Tv  =  720  K,  Uoo  =  4235  m/s,  and  =  8.83.  We  use  a  5-species  model  for  air  with  3 
dissociation  reactions  and  the  two  Zeldovich  reactions  involving  NO.  The  free-stream  conditions  produce 
significant  levels  of  O2  dissociation  in  the  separation  region  and  downstream  of  the  strong  bow  shock  and 
shock  triple  point.  Cases  were  run  with  three  grids  of  increasing  resolution:  256  x  128  (axial  x  normal), 
512  x  256  and  1024  x  512.  Previous  work  has  shown  that  second-order  accurate  methods  should  produce 
converged  solutions  on  the  finest  of  these  grids.12,2' 

Figure  9  shows  the  convergence  histories  for  the  two  methods  on  each  grid.  For  these  cases,  the  maximum 
allowable  CFL  is  capped  at  about  2000;  attempts  to  run  at  larger  time  steps  result  in  instabilities  downstream 
of  the  shock  interaction.  Thus,  the  number  of  time  steps  required  for  convergence  to  steady  state  increases 
as  the  grid  is  refined.  The  DPLR  method  does  not  demonstrate  machine-zero  convergence,  with  a  limit  cycle 
developing  just  downstream  of  the  shock  triple  point.  The  decoupled  method  converges  to  machine  zero 
after  a  large  number  of  time  steps.  It  should  be  noted  that  this  flow  takes  a  large  time  to  reach  steady-state 
because  of  the  tight  coupling  between  the  separation  zone  size,  the  separation  shock,  the  shock  interactions, 
and  the  impingement  of  the  transmitted  shock  on  the  second  cone  surface. 

Comparisons  of  flow  field  shows  that  there  are  small  differences  between  the  two  methods  near  the 
shock  triple  point.  In  particular,  on  the  coarse  grid  DPLR  predicts  the  bow  shock  to  be  further  upstream 
than  the  decoupled  method.  However,  as  the  grid  is  refined  the  differences  decrease  so  that  on  the  fine 
(1024  x  512)  grid  the  contour  lines  are  essentially  identical.  On  a  more  quantitative  level,  consider  Figure 
10  which  plots  the  heat  transfer  rate  along  the  surface  near  the  separation  zone  and  the  shock  impingement 
location.  Results  from  the  three  grids  are  plotted,  and  it  is  apparent  that  the  coarse  grid  calculations  are 
not  grid  converged  and  barely  capture  the  large  under-shoot  and  over-shoot  of  heat  flux  due  to  the  shock¬ 
boundary  layer  interaction.  The  two  methods  produce  essentially  identical  results  on  the  fine  grid.  Thus,  as 
the  solution  becomes  grid-converged  the  two  methods  produce  the  same  results. 


C.  Hydrogen- Air  Combustion 
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Finally,  the  flow  of  a  stoichiometric  mixture  of  hydrogen  and  air  in  a  two-dimensional  duct  with  a  10° 
ramp  was  considered.  The  duct  is  3  cm  long  and  2  cm  high,  with  the  ramp  starting  1  cm  from  the  lower  wall 
leading  edge.  The  incoming  flow  has  a  temperature  of  1200  K,  a  Mach  number  of  4,  and  a  pressure  of  1  atm; 
an  adiabatic  wall  boundary  condition  is  used.  The  boundary  layer  and  the  shock  produced  by  the  ramp 
cause  the  temperature  to  rise  above  the  ignition  temperature.  This  test  case  has  been  used  by  a  number  of 
researchers;  see  Ref.  5  for  more  details.  The  current  grid  is  140  x  200,  as  compared  to  the  90  x  60  grid  used 
in  previous  studies. 

Two  H2-air  kinetics  models  were  used:  the  7-species,  8-reaction  model  of  Evans  and  Schexnayder28  and 
the  13-species,  33-reaction  model  of  Jachimowski.29  Figure  11  plots  the  convergence  history  for  the  DPLR 
and  decoupled  method  runs.  Both  methods  converge  about  8  orders  of  magnitude  in  600  time  steps,  and 
the  convergence  histories  are  similar  to  one  another.  There  are  minor  differences  between  the  convergence 
histories  depending  on  the  kinetics  model  used.  For  this  case,  the  largest  CFL  number  used  was  50  for 
both  methods  (corresponding  to  a  time  step  of  0.076 /usee).  Larger  time  steps  (up  to  CFL  =  1000)  result  in 
slower  convergence  and  limit  cycles  at  larger  values  of  the  residual.  Integrating  the  Evans  and  Schexnayder 
kinetics  model  at  the  adiabatic  wall  conditions  for  this  test  case,  we  find  that  the  initial  formation  of  water 
takes  place  in  less  than  1  /usee  and  the  characteristic  time  scale  is  approximately  0.014  /usee.  Thus,  the  CFL 
number  used  here  is  approximately  5  times  larger  than  the  characteristic  chemical  reaction  time. 

VII.  Conclusions 

Present  implicit  numerical  methods  for  the  simulation  of  aerothermodynamics  and  high-temperature 
reacting  flows  solve  the  governing  equations  in  a  fully  coupled  fashion.  That  is,  they  simultaneously  solve 
the  conservation  equations  for  each  chemical  species  mass,  total  momentum,  internal  energy  (or  energies), 
and  total  energy.  The  cost  of  solving  the  resulting  system  of  equations  scales  quadratically  with  the  number 
of  equations.  We  explore  a  different  approach,  in  which  the  total  mass,  momentum,  and  energy  equations 
are  decoupled  from  the  species  mass  and  internal  energy  conservation  equations.  Then  the  two  sets  of 
equations  are  solved  in  two  steps:  first  the  data-parallel  line-relaxation  (DPLR)  method  is  used  to  solve 
the  total  conservation  equations;  then  a  simplified  form  of  DPLR  is  used  to  solve  the  remaining  equations. 
In  particular,  the  off-diagonal  elements  of  the  source  vector  Jacobian  matrix  are  lagged  in  the  relaxation 
process,  making  the  problem  a  scalar  tridiagonal  system.  This  results  in  a  significant  decrease  in  solution 
cost,  both  in  terms  of  computational  time  and  memory.  We  have  run  numerous  test  cases  and  found  that 
the  proposed  decoupled  method  converges  in  essentially  the  same  number  of  time  steps  as  the  fully-coupled 
DPLR  approach.  Therefore,  the  proposed  method  reduces  the  computational  time  and  memory  requirements 
by  a  significant  margin.  For  the  cases  considered  here,  the  decoupled  method  is  shown  to  be  significantly 
faster  than  the  DPLR  method.  For  example,  with  a  38-species  158-reaction  chemical  kinetics  model,  it  is 
4.6  times  faster,  and  most  of  the  solution  time  is  associated  with  the  evaluation  of  the  chemical  source  term 
and  its  Jacobian  matrix.  For  this  case,  the  proposed  method  reduces  the  memory  requirements  by  a  factor 
of  4.4.  The  decoupled  approach  could  be  adapted  to  other  implicit  methods  for  high-temperature  reacting 
flows. 
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Figure  1.  Computational  cost  of  the  fully-coupled  data-parallel  line  relaxation  (DPLR) 
implicit  solve  and  the  decoupled  implicit  solve  for  a  two-dimensional  problem. 
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Figure  2.  Speedup  of  the  decoupled  implicit  solution  method  relative  to  the  DPLR  method 
as  a  function  of  the  number  of  chemical  species  in  the  model. 
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Figure  3.  Computed  convective  heat  flux  for  the  Mach  15  test  case  as  a  function  of  surface 
distance  for  the  DPLR  and  decoupled  methods. 
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Figure  4.  Computed  mass  fractions  of  selected  species  on  the  stagnation  streamline  for  the  Mach  15  test 
case. 
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Figure  5.  Convergence  history  and  computational  time  for  the  sphere-cone  test  case  at  M  =  15  conditions 
for  a  21  species,  32  reaction  air-CC>2  chemical  kinetics  model. 
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Figure  6.  Convergence  history  for  the  sphere-cone  test  case  at  M  =  20  and  M  =  25  conditions  for  a  11 
species,  17  reaction  air  chemical  kinetics  model. 
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Number  of  Species 

Figure  7.  Speed  up  (ratio  of  DPLR  to  decoupled  solution  time)  with  number  of  chemical  species. 
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Figure  8.  Cost  of  each  part  of  the  solution  as  a  function  of  the  number  of  chemical  species:  DPLR  (upper) 
Decoupled  method  (lower). 
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Figure  9.  Convergence  history  for  the  double-cone  test  case. 
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Figure  10.  Computed  convective  heat  flux  for  the  double-cone  test  case  in  the  region  of  the  separation  zone 
and  shock  impingement. 
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Figure  11.  Convergence  history  for  the  H2-air  test  case. 
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